As ecosystems are facing challenges from anthropogenic sources including land use changes and habitat destruction, methods for their re-establishment are being investigated.
Assisting the recovery of and building resilience in ecosystems requires the study of how ecological communities respond to different environmental conditions including drought. According to the UK Centre for Ecology and Hydrology, grasslands make up 40% of the UK’s land surface while grassland restoration has been identified as a means of increasing biodiversity and reducing carbon emissions. Soils have been identified as one of the most biodiverse habitats, and are home to between 44% and 74% of all species on Earth. Soil microbial communities, including bacteria and fungi, provide ecosystem services such as nutrient cycling, decomposition and regulating carbon storage, which are affected by changes in soil moisture. According to the Environment Agency, soil moisture deficits across the UK have increased in March and April 2025. Additionally more frequent droughts are predicted due to climate change, therefore a greater understanding of how soil microbial communities respond to changes in soil moisture is key to building resilience in grassland ecosystems. The aim of this study is to investigate how microbial species diversity, community composition and relationships between bacterial and fungal communities in grassland soils respond to changes in soil moisture levels. DNA sequences from bacterial 16S rRNA and fungal ITS from soil samples collected by the Restoring Resilient Ecosystems (RestREco) ecological restoration project will be analysed using bioinformatics tools. The methods will include running computational analysis tools using the Crescent2 high performance computing facilities and/or RStudio, testing hypotheses using statistical tests and placing findings in the context of current literature in scientific journals.
This page focuses on alpha and beta diversity and taxonomy.
For differential abundance results, visit: https://kerrycranfield.github.io/RestREcoDrought/restreco_diff_abund.html
For functional analysis, visit: https://kerrycranfield.github.io/RestREcoDrought/restreco_func.html
RestREco (Restoring Resilient Ecosystems) is a NERC-funded research project that aims to identify ecological restoration practices that can build resilience in ecosystems, particularly woodlands and grasslands. The initiative brings together researchers from:
RestREco studies a network of 133 ecological restoration sites across England and Scotland. The project splits its work into seven work packages, each focusing on different factors and relating them to ecosystem characteristics that arise from component interactions. It aims to identify key drivers of ecosystem development, including:
The goal is to understand how these factors influence ecosystem complexity, function, and resilience to future pressures (RestREco, 2024).
This study uses targeted amplicon sequences (16S, ITS), retrieved from soil samples, to determine how drought affects soil microbial communities, specifically bacteria and fungi, in UK grasslands.
This study specifically focuses on:
A total of 72 soil samples were collected across six sites in England for each marker (6 per site: 3 control, 3 shelter).
Some fungal samples failed the initial quality control during the sequencing process, resulting in the removal of one site from the analysis (for fungi only), while other sites where one or two samples failed quality, were kept in (also for fungi).
Sample Zone - Based on GPS coordinates
| Metric | 16S | ITS |
|---|---|---|
| Microbial group | Bacteria | Fungi |
| Region sampled | England | England |
| Number of sites | 6 | 5 |
| Samples per site | 6 | 6 |
| Total samples removed | 0 | 3 |
| Total samples | 36 | 27 |
| Total samples per treatment | 18 | 13 (for most sites) |
| Average reads per sample | ~66 000 | ~61 000 |
Sampling Summary
QIIME2 generated feature table, taxonomy file, metadata and rooted phylogenetic tree were imported into R (insert citation) using qiime2r and objects for bacteria and fungi were generated using R phyloseq library (insert citation). General statistics were generated using the microbiome package (insert citation).
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 28079 taxa and 36 samples ]
## sample_data() Sample Data: [ 36 samples by 4 sample variables ]
## tax_table() Taxonomy Table: [ 28079 taxa by 7 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 28079 tips and 28055 internal nodes ]
## [[1]]
## [1] "1] Min. number of reads = 13222"
##
## [[2]]
## [1] "2] Max. number of reads = 32667"
##
## [[3]]
## [1] "3] Total number of reads = 865814"
##
## [[4]]
## [1] "4] Average number of reads = 24050.3888888889"
##
## [[5]]
## [1] "5] Median number of reads = 23675.5"
##
## [[6]]
## [1] "7] Sparsity = 0.9460915828753"
##
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
##
## [[8]]
## [1] "8] Number of singletons = 951"
##
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n (i.e. exactly one read detected across all samples)3.38687275187863"
##
## [[10]]
## [1] "10] Number of sample variables are: 4"
##
## [[11]]
## [1] "Site" "Field_no" "Treatment" "Group_by"
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 6327 taxa and 25 samples ]
## sample_data() Sample Data: [ 25 samples by 4 sample variables ]
## tax_table() Taxonomy Table: [ 6327 taxa by 7 taxonomic ranks ]
The bacteria feature table was filtered to remove anything assigned as Archaea or unassigned.
##
## d__Archaea d__Bacteria Unassigned
## 19 28055 5
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 28055 taxa and 36 samples ]
## sample_data() Sample Data: [ 36 samples by 4 sample variables ]
## tax_table() Taxonomy Table: [ 28055 taxa by 7 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 28055 tips and 28031 internal nodes ]
The fungal feature table was also checked to ensure there were only fungi present.
##
## Fungi
## 6327
Both the bacterial and fungal feature tables were then agglomerated to genera level. Statistics such as read counts and sparsity metrics (proportion of cells where count = 0) were generated using the microbiome package. This was to see how the number of reads and taxa change with each step.
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1118 taxa and 36 samples ]
## sample_data() Sample Data: [ 36 samples by 4 sample variables ]
## tax_table() Taxonomy Table: [ 1118 taxa by 7 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1118 tips and 1117 internal nodes ]
## [[1]]
## [1] "1] Min. number of reads = 13220"
##
## [[2]]
## [1] "2] Max. number of reads = 32660"
##
## [[3]]
## [1] "3] Total number of reads = 865546"
##
## [[4]]
## [1] "4] Average number of reads = 24042.9444444444"
##
## [[5]]
## [1] "5] Median number of reads = 23657.5"
##
## [[6]]
## [1] "7] Sparsity = 0.666790896442059"
##
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
##
## [[8]]
## [1] "8] Number of singletons = 3"
##
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n (i.e. exactly one read detected across all samples)0.268336314847943"
##
## [[10]]
## [1] "10] Number of sample variables are: 4"
##
## [[11]]
## [1] "Site" "Field_no" "Treatment" "Group_by"
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 711 taxa and 25 samples ]
## sample_data() Sample Data: [ 25 samples by 4 sample variables ]
## tax_table() Taxonomy Table: [ 711 taxa by 7 taxonomic ranks ]
## [[1]]
## [1] "1] Min. number of reads = 30659"
##
## [[2]]
## [1] "2] Max. number of reads = 78553"
##
## [[3]]
## [1] "3] Total number of reads = 1329016"
##
## [[4]]
## [1] "4] Average number of reads = 53160.64"
##
## [[5]]
## [1] "5] Median number of reads = 54650"
##
## [[6]]
## [1] "7] Sparsity = 0.697383966244726"
##
## [[7]]
## [1] "6] Any OTU sum to 1 or less? NO"
##
## [[8]]
## [1] "8] Number of singletons = 0"
##
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n (i.e. exactly one read detected across all samples)0"
##
## [[10]]
## [1] "10] Number of sample variables are: 4"
##
## [[11]]
## [1] "Site" "Field_no" "Treatment" "Group_by"
Feature tables were then aggregated to site level using speedyseq (insert citation) as phyloseq merge_samples combines abundances using sum. Median was felt to be more sample representative. Read counts were again generated to check the aggregation step was performed correctly and metadata was checked to ensure no NAs were coerced into the data.
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1118 taxa and 12 samples ]:
## sample_data() Sample Data: [ 12 samples by 4 sample variables ]:
## tax_table() Taxonomy Table: [ 1118 taxa by 7 taxonomic ranks ]:
## phy_tree() Phylogenetic Tree: [ 1118 tips and 1117 internal nodes ]:
## taxa are rows
## [[1]]
## [1] "1] Min. number of reads = 16398"
##
## [[2]]
## [1] "2] Max. number of reads = 28026"
##
## [[3]]
## [1] "3] Total number of reads = 276014"
##
## [[4]]
## [1] "4] Average number of reads = 23001.1666666667"
##
## [[5]]
## [1] "5] Median number of reads = 22886.5"
##
## [[6]]
## [1] "7] Sparsity = 0.692382230172928"
##
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
##
## [[8]]
## [1] "8] Number of singletons = 484"
##
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n (i.e. exactly one read detected across all samples)0.0894454382826476"
##
## [[10]]
## [1] "10] Number of sample variables are: 4"
##
## [[11]]
## [1] "Site" "Field_no" "Treatment" "Group_by"
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 711 taxa and 10 samples ]:
## sample_data() Sample Data: [ 10 samples by 4 sample variables ]:
## tax_table() Taxonomy Table: [ 711 taxa by 7 taxonomic ranks ]:
## taxa are rows
## [[1]]
## [1] "1] Min. number of reads = 36120"
##
## [[2]]
## [1] "2] Max. number of reads = 56936"
##
## [[3]]
## [1] "3] Total number of reads = 481530"
##
## [[4]]
## [1] "4] Average number of reads = 48153"
##
## [[5]]
## [1] "5] Median number of reads = 48798.5"
##
## [[6]]
## [1] "7] Sparsity = 0.705344585091421"
##
## [[7]]
## [1] "6] Any OTU sum to 1 or less? YES"
##
## [[8]]
## [1] "8] Number of singletons = 203"
##
## [[9]]
## [1] "9] Percent of OTUs that are singletons \n (i.e. exactly one read detected across all samples)0.281293952180028"
##
## [[10]]
## [1] "10] Number of sample variables are: 4"
##
## [[11]]
## [1] "Site" "Field_no" "Treatment" "Group_by"
Rarefaction curves were then generated to find sampling depths using the rarecurve function in the vegan package to generate a tibble friendly data frame to be passed to ggplot2 to generate attractive rarefaction plots.
Rarefaction was applied to the bacterial and fungal feature tables. The bacteria feature table was rarefied to a sampling depth of 16 398 while fungi was rarefied to a sampling depth of 36 120.
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 627 taxa and 12 samples ]:
## sample_data() Sample Data: [ 12 samples by 4 sample variables ]:
## tax_table() Taxonomy Table: [ 627 taxa by 7 taxonomic ranks ]:
## phy_tree() Phylogenetic Tree: [ 627 tips and 626 internal nodes ]:
## taxa are rows
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1084 taxa and 36 samples ]:
## sample_data() Sample Data: [ 36 samples by 4 sample variables ]:
## tax_table() Taxonomy Table: [ 1084 taxa by 7 taxonomic ranks ]:
## phy_tree() Phylogenetic Tree: [ 1084 tips and 1083 internal nodes ]:
## taxa are rows
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 509 taxa and 10 samples ]:
## sample_data() Sample Data: [ 10 samples by 4 sample variables ]:
## tax_table() Taxonomy Table: [ 509 taxa by 7 taxonomic ranks ]:
## taxa are rows
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 708 taxa and 25 samples ]:
## sample_data() Sample Data: [ 25 samples by 4 sample variables ]:
## tax_table() Taxonomy Table: [ 708 taxa by 7 taxonomic ranks ]:
## taxa are rows
Bar plots showing taxonomic composition at a phylum, class and family were generated for each variable being assessed: treatment, site and group (site-treatment). Functions for each type of bar plot were created.
Function for bar plot at family level (cutoff for grouping was set to 2% as a colour palette large enough to show all families present was not available at the time:
Alpha diversity calculations were run for both bacteria and fungi to determine if there were differences in diversity between sites, treatments and groups.
Alpha diversity plots were generated
## Chao1 se.chao1 Shannon Simpson
## HARLEY_FARMS_new2b.Control 362.3333 3.4039503 4.802679 0.9839348
## HARLEY_FARMS_new2b.Drought 340.6400 2.8689114 4.637164 0.9803463
## WINDMILL_farm.Control 312.0526 0.2553923 4.748467 0.9840634
## WINDMILL_farm.Drought 402.3030 6.1951083 4.835666 0.9849811
## Castle_Field_West_W.Control 348.7500 2.4097307 4.895923 0.9871872
## Castle_Field_West_W.Drought 352.3333 1.5128057 4.866165 0.9865795
## Jemma_6.Control 368.2759 3.5716413 4.811119 0.9848627
## Jemma_6.Drought 349.3235 1.4677009 4.766784 0.9841443
## Jemma_9.Control 333.9545 4.5211664 4.634408 0.9828976
## Jemma_9.Drought 320.0385 3.0506240 4.589781 0.9819630
## Lardon.Chase.Control 321.6400 2.3337655 4.613960 0.9808540
## Lardon.Chase.Drought 320.5652 3.3753254 4.606299 0.9805645
## Chao1 se.chao1 Shannon Simpson
## P19C1 403.2195 4.88241506 4.862187 0.9848159
## P19C2 391.6774 5.31177472 4.876749 0.9853763
## P19C3 397.6875 8.70193023 4.816635 0.9844514
## P19S1 342.3636 3.95687115 4.619438 0.9798207
## P19S2 432.8077 5.84213144 4.854027 0.9845005
## P19S3 362.7895 8.35360799 4.656803 0.9803369
## P20C1 332.1364 0.41939654 4.867252 0.9861849
## P20C2 314.3333 3.31205690 4.732067 0.9839524
## P20C3 392.7632 6.83887130 4.791064 0.9842036
## P20S1 427.0000 6.37667826 4.899164 0.9856368
## P20S2 424.8750 10.08516817 4.842437 0.9846286
## P20S3 445.0196 9.85975813 4.824488 0.9836183
## P2C1 437.7660 6.86367011 5.016905 0.9883510
## P2C2 369.6667 4.41362368 4.900065 0.9870109
## P2C3 374.4545 5.89515694 4.871537 0.9867647
## P2S1 384.6579 3.98204181 4.911651 0.9866959
## P2S2 389.4839 9.33712710 4.877424 0.9871407
## P2S3 389.9767 4.02464189 4.913404 0.9870237
## P54C1 399.6923 5.09510775 4.920525 0.9864512
## P54C2 432.2857 15.30696899 4.830480 0.9852618
## P54C3 402.5714 6.00027513 4.787565 0.9833969
## P54S1 405.5227 6.25307921 4.818479 0.9842408
## P54S2 372.6154 7.71197696 4.777175 0.9841551
## P54S3 354.8857 3.29486351 4.747247 0.9837554
## P55C1 373.5312 7.00594123 4.677237 0.9826638
## P55C2 240.7143 2.99081547 4.275542 0.9759490
## P55C3 400.0000 7.68283400 4.824805 0.9856630
## P55S1 307.6364 0.95905767 4.584209 0.9821484
## P55S2 366.0857 7.06073958 4.678670 0.9830788
## P55S3 338.3333 4.66761917 4.641621 0.9826760
## P67C1 281.0000 0.01721067 4.600609 0.9815854
## P67C2 373.6383 4.56671019 4.653293 0.9805289
## P67C3 380.3077 3.27725056 4.806963 0.9844517
## P67S1 358.3333 4.66762509 4.679350 0.9825411
## P67S2 354.0000 7.21313132 4.572444 0.9789350
## P67S3 330.0370 6.35557578 4.574235 0.9798843
Post-hoc pairwise testing was carried out for the site and group variables using Dunn test as it is the recommended test to be run after Kruskal-Wallis:
Distances between samples were also calculated using ordinate method. The distance metrics selected were weighted/unweighted unifrac and Bray-Curtis
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = unifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## Treatment 1 0.05273 0.11294 1.6657 0.013 *
## Site 5 0.25586 0.54803 1.6165 0.001 ***
## Residual 5 0.15828 0.33902
## Total 11 0.46688 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = unifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = NULL)
## Df SumOfSqs R2 F Pr(>F)
## Model 6 0.30860 0.66098 1.6247 0.001 ***
## Residual 5 0.15828 0.33902
## Total 11 0.46688 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = wunifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## Treatment 1 0.003162 0.03713 0.8583 0.534
## Site 5 0.063599 0.74661 3.4523 0.002 **
## Residual 5 0.018422 0.21626
## Total 11 0.085184 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = wunifrac_dist.bac ~ Treatment + Site, data = metadata.bac, by = NULL)
## Df SumOfSqs R2 F Pr(>F)
## Model 6 0.066762 0.78374 3.02 0.001 ***
## Residual 5 0.018422 0.21626
## Total 11 0.085184 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = bray_dist.bac ~ Treatment + Site, data = metadata.bac, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## Treatment 1 0.009715 0.03179 0.7017 0.670
## Site 5 0.226687 0.74173 3.2749 0.001 ***
## Residual 5 0.069219 0.22649
## Total 11 0.305621 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = bray_dist.bac ~ Treatment + Site, data = metadata.bac, by = NULL)
## Df SumOfSqs R2 F Pr(>F)
## Model 6 0.236401 0.77351 2.846 0.001 ***
## Residual 5 0.069219 0.22649
## Total 11 0.305621 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = bray_dist.fun ~ Treatment + Site, data = metadata.fun, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## Treatment 1 0.04056 0.05217 1.0789 0.423
## Site 4 0.58654 0.75441 3.9005 0.001 ***
## Residual 4 0.15038 0.19342
## Total 9 0.77748 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = bray_dist.fun ~ Treatment + Site, data = metadata.fun, by = NULL)
## Df SumOfSqs R2 F Pr(>F)
## Model 5 0.62710 0.80658 3.3361 0.004 **
## Residual 4 0.15038 0.19342
## Total 9 0.77748 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = jacc_dist.fun ~ Treatment + Site, data = metadata.fun, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## Treatment 1 0.08834 0.06436 1.0251 0.431
## Site 4 0.93962 0.68451 2.7256 0.002 **
## Residual 4 0.34473 0.25114
## Total 9 1.37270 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation test for adonis under reduced model
## Permutation: free
## Number of permutations: 999
##
## adonis2(formula = jacc_dist.fun ~ Treatment + Site, data = metadata.fun, by = NULL)
## Df SumOfSqs R2 F Pr(>F)
## Model 5 1.02796 0.74886 2.3855 0.002 **
## Residual 4 0.34473 0.25114
## Total 9 1.37270 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1